Introduction

In this notebook we demonstrate how to use Milo to detect abherrant cell states in diseased tissues, using a dataset of hepatic non-parenchymal cells isolated from 5 healthy and 5 cirrhotic human livers. Ramachandran et al. 2019 (GEO accessiion: GSE136103).

Installation

Added by DRT 2021-12-03

Updated by DRT 2023-11-29

source("getReqdPkgs.r")

pkgs <- c("SingleCellExperiment","scater","scran","Seurat","miloR",
  "tidyverse","patchwork","igraph","ggplot2",
  "ggrastr","msigdbr","clusterProfiler",
  "RColorBrewer","cowplot","devtools")

extra_pkgs <- c("irlba","Matrix") # NOTE these must be installed from source to make code work; see next code block

invisible(getReqdPkgs(pkgs))
'getOption("repos")' replaces Bioconductor standard repositories, see 'help("repositories", package = "BiocManager")' for
details.
Replacement repositories:
    CRAN: https://cran.rstudio.com/
All required CRAN and BioC packages have previously been installed

NOTE

As of 2023-11-29, the runPCA function in scater breaks due to change in Matrix library; must downgrade to Matrix 1.6.1 Downloaded v 1.6.1 from https://cran.r-project.org/src/contrib/Archive/Matrix/

installing from source (compiling on your computer, rather than installing from a binary) may also be needed. On a Mac, this requires gfortran to be installed. See here and link to download gfortran 12.2

See this link: https://github.com/bwlewis/irlba/issues/70

# install.packages("irlba", type = "source")
# install.packages("Matrix", type = "source") # , version="1.6.1"
library(scater)
Loading required package: SingleCellExperiment
Loading required package: SummarizedExperiment
Loading required package: MatrixGenerics
Loading required package: matrixStats

Attaching package: ‘MatrixGenerics’

The following objects are masked from ‘package:matrixStats’:

    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse, colCounts, colCummaxs, colCummins,
    colCumprods, colCumsums, colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs, colMads, colMaxs,
    colMeans2, colMedians, colMins, colOrderStats, colProds, colQuantiles, colRanges, colRanks, colSdDiffs,
    colSds, colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads, colWeightedMeans,
    colWeightedMedians, colWeightedSds, colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods, rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs,
    rowLogSumExps, rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins, rowOrderStats, rowProds,
    rowQuantiles, rowRanges, rowRanks, rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
    rowWeightedMads, rowWeightedMeans, rowWeightedMedians, rowWeightedSds, rowWeightedVars

Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, aperm, append, as.data.frame, basename, cbind, colnames, dirname, do.call, duplicated,
    eval, evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff,
    sort, table, tapply, union, unique, unsplit, which.max, which.min

Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:utils’:

    findMatches

The following objects are masked from ‘package:base’:

    expand.grid, I, unname

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.


Attaching package: ‘Biobase’

The following object is masked from ‘package:MatrixGenerics’:

    rowMedians

The following objects are masked from ‘package:matrixStats’:

    anyMissing, rowMedians

Loading required package: scuttle
Loading required package: ggplot2
library(scran)
library(miloR)
Loading required package: edgeR
Loading required package: limma

Attaching package: ‘limma’

The following object is masked from ‘package:scater’:

    plotMDS

The following object is masked from ‘package:BiocGenerics’:

    plotMA


Attaching package: ‘edgeR’

The following object is masked from ‘package:SingleCellExperiment’:

    cpm
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────────────────────────────────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ lubridate 1.9.3     ✔ tibble    3.2.1
✔ purrr     1.0.2     ✔ tidyr     1.3.0── Conflicts ────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
✖ lubridate::%within%() masks IRanges::%within%()
✖ dplyr::collapse()     masks IRanges::collapse()
✖ dplyr::combine()      masks Biobase::combine(), BiocGenerics::combine()
✖ dplyr::count()        masks matrixStats::count()
✖ dplyr::desc()         masks IRanges::desc()
✖ tidyr::expand()       masks S4Vectors::expand()
✖ dplyr::filter()       masks stats::filter()
✖ dplyr::first()        masks S4Vectors::first()
✖ dplyr::lag()          masks stats::lag()
✖ ggplot2::Position()   masks BiocGenerics::Position(), base::Position()
✖ purrr::reduce()       masks GenomicRanges::reduce(), IRanges::reduce()
✖ dplyr::rename()       masks S4Vectors::rename()
✖ lubridate::second()   masks S4Vectors::second()
✖ lubridate::second<-() masks S4Vectors::second<-()
✖ dplyr::slice()        masks IRanges::slice()
ℹ Use the ]8;;http://conflicted.r-lib.org/conflicted package]8;; to force all conflicts to become errors
library(patchwork)
library(igraph)

Attaching package: ‘igraph’

The following objects are masked from ‘package:lubridate’:

    %--%, union

The following objects are masked from ‘package:dplyr’:

    as_data_frame, groups, union

The following objects are masked from ‘package:purrr’:

    compose, simplify

The following object is masked from ‘package:tidyr’:

    crossing

The following object is masked from ‘package:tibble’:

    as_data_frame

The following object is masked from ‘package:miloR’:

    graph

The following object is masked from ‘package:scater’:

    normalize

The following object is masked from ‘package:GenomicRanges’:

    union

The following object is masked from ‘package:IRanges’:

    union

The following object is masked from ‘package:S4Vectors’:

    union

The following objects are masked from ‘package:BiocGenerics’:

    normalize, path, union

The following objects are masked from ‘package:stats’:

    decompose, spectrum

The following object is masked from ‘package:base’:

    union
library(ggplot2)
library(ggrastr)
library(msigdbr)
library(cowplot)

Attaching package: ‘cowplot’

The following object is masked from ‘package:patchwork’:

    align_plots

The following object is masked from ‘package:lubridate’:

    stamp
library(Matrix)

Attaching package: ‘Matrix’

The following objects are masked from ‘package:tidyr’:

    expand, pack, unpack

The following object is masked from ‘package:S4Vectors’:

    expand
library(irlba)
library(RColorBrewer)
library(SingleCellExperiment)

Define some global variables

Location to save output files and whether to overwrite output files.

OUTPUT_DIR <- "~/dropbox-vu/temp/milo_output"
OVERWRITE <- FALSE

Load data

We downloaded the dataset and annotations stored in Seurat object from here, as indicated by the authors.

# load("/nfs/team205/ed6/data/Ramachandran2019_liver/tissue.rdata")
load(url("https://www.dropbox.com/s/bq816h74gmh84gu/tissue.rdata?dl=1"))
Loading required package: Seurat
Loading required package: SeuratObject
Loading required package: sp

Attaching package: ‘sp’

The following object is masked from ‘package:IRanges’:

    %over%

‘SeuratObject’ was built under R 4.3.0 but the current version is 4.3.2; it is recomended that you reinstall
‘SeuratObject’ as the ABI for R may have changed
‘SeuratObject’ was built with package ‘Matrix’ 1.6.3 but the current version is 1.6.4; it is recomended that
you reinstall ‘SeuratObject’ as the ABI for ‘Matrix’ may have changed

Attaching package: ‘SeuratObject’

The following object is masked from ‘package:SummarizedExperiment’:

    Assays

The following object is masked from ‘package:GenomicRanges’:

    intersect

The following object is masked from ‘package:GenomeInfoDb’:

    intersect

The following object is masked from ‘package:IRanges’:

    intersect

The following object is masked from ‘package:S4Vectors’:

    intersect

The following object is masked from ‘package:BiocGenerics’:

    intersect

The following object is masked from ‘package:base’:

    intersect

Registered S3 method overwritten by 'data.table':
  method           from
  print.data.table     
Registered S3 method overwritten by 'htmlwidgets':
  method           from         
  print.htmlwidget tools:rstudio

Attaching package: ‘Seurat’

The following object is masked from ‘package:igraph’:

    components

The following object is masked from ‘package:SummarizedExperiment’:

    Assays
## Convert to SingleCellExperiment
liver_sce <- SingleCellExperiment(assay = list(counts=tissue@raw.data, logcounts=tissue@data),
                                  colData = tissue@meta.data)

liver_sce
class: SingleCellExperiment 
dim: 23498 58358 
metadata(0):
assays(2): counts logcounts
rownames(23498): FO538757.2 AP006222.2 ... CTA-126B4.7 LINC01423
rowData names(0):
colnames(58358): Healthy1_Cd45+_AAACCTGCAGTATCTG Healthy1_Cd45+_AACTGGTTCATGGTCA ...
  Cirrhotic3_Cd45-_TTTGTCATCCAGGGCT Cirrhotic3_Cd45-_TCTGGAAGTCATCCCT
colData names(10): nGene nUMI ... annotation_indepth annotation_lineage
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):

Preprocessing

We use the same number of highly variable genes and principal components used by the authors of the original study.

Feature selection

Select highly variable genes

dec_liver <- modelGeneVar(liver_sce)

fit_liver <- metadata(dec_liver)
plot(fit_liver$mean, fit_liver$var, xlab="Mean of log-expression",
    ylab="Variance of log-expression")


hvgs <- getTopHVGs(dec_liver, n=3000)

Dimensionality reduction

set.seed(42)
liver_sce <- runPCA(liver_sce, subset_row=hvgs, ncomponents=11)
liver_sce <- runUMAP(liver_sce, dimred="PCA", ncomponents=2)
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
scater::plotUMAP(liver_sce, colour_by="condition", point_alpha=1,  point_size=0.5)

scater::plotUMAP(liver_sce, colour_by="dataset", point_alpha=0.3,  point_size=0.5)

scater::plotUMAP(liver_sce, colour_by="annotation_lineage", point_alpha=0.3,  point_size=0.5, text_by='annotation_lineage')

Notably, this dataset doesn’t appear to display a batch effect

Note: This step is not required. It generates a large file (even larger than raw data) and then reloads it.

# saveRDS(liver_sce, "~/dropbox-vu/temp/milo_data/Ramachandran2019_liver/liver_SCE_20210225.RDS")
# liver_sce <- readRDS(url("https://www.dropbox.com/s/z0aopf616b1urvj/liver_SCE_20210225.RDS?dl=1"))

Differential Abundance (DA) analysis with Milo

We test for differential abundance between healthy and cirrhotic livers. We start by defining neighbourhoods with refined sampling on the KNN graph. We inspect the size of neighbourhoods.

liver_milo <- Milo(liver_sce)

## Build KNN graph
liver_milo <- buildGraph(liver_milo, d = 11, k=30)
Constructing kNN graph with k:30
## Compute neighbourhoods with refined sampling
liver_milo <- makeNhoods(liver_milo, k=30, d=11, prop = 0.05, refined=TRUE)
Checking valid object
Running refined sampling with reduced_dim
plotNhoodSizeHist(liver_milo, bins=150)

Then we make a design matrix for the differential test, assigning samples to biological conditions.

colData(liver_milo)[['sort']] <- str_remove(colData(liver_milo)[['dataset']], ".+_")
colData(liver_milo)[['sort']] <- str_remove(colData(liver_milo)[['sort']], "A|B")

liver_meta <- as_tibble(colData(liver_milo)[,c("dataset","condition", 'sort')])
liver_meta <- distinct(liver_meta) %>%
  mutate(condition=factor(condition, levels=c("Uninjured", "Cirrhotic"))) %>%
  column_to_rownames("dataset")

Now we can count cells in neighbourhoods and perform the DA test.

NOTE: Executing calcNhoodDistance() function takes a long time (>30 min)

liver_milo <- countCells(liver_milo, samples = "dataset", meta.data = data.frame(colData(liver_milo)[,c("dataset","condition",'sort')]) )
Checking meta.data validity
Counting cells in neighbourhoods
liver_milo <- calcNhoodDistance(liver_milo, d=11)
'as(<dgTMatrix>, "dgCMatrix")' is deprecated.
Use 'as(., "CsparseMatrix")' instead.
See help("Deprecated") and help("Matrix-deprecated").
milo_res <- testNhoods(liver_milo, design = ~ condition, design.df = liver_meta[colnames(nhoodCounts(liver_milo)),])
Using TMM normalisation
Performing spatial FDR correction withk-distance weighting
milo_res_sort <- testNhoods(liver_milo, design = ~ sort + condition, design.df = 
                                liver_meta[colnames(nhoodCounts(liver_milo)),])
Using TMM normalisation
Performing spatial FDR correction withk-distance weighting
compare_da_df <- left_join(milo_res_sort, milo_res, by="Nhood", suffix=c("_sort", "_nosort")) %>%
  {annotateNhoods(liver_milo, ., 'annotation_lineage')} 

compare_da_df %>%
  ggplot(aes(-log10(SpatialFDR_sort), -log10(SpatialFDR_nosort))) +
  geom_point(size=0.8) +
  geom_point(data=. %>% filter(annotation_lineage=="Endothelia"), color="red")

plot(milo_res_sort$SpatialFDR, milo_res$SpatialFDR)

Exploration of Milo DA results

We can start by looking at some basic stats

pval_hist <- milo_res %>%
  ggplot(aes(PValue)) +
  geom_histogram(bins=50) +
  theme_bw(base_size=14)

volcano <-
  milo_res %>%
  ggplot(aes(logFC, -log10(SpatialFDR))) +
  geom_point(size=0.4, alpha=0.2) +
  geom_hline(yintercept = -log10(0.1)) +
  xlab("log-Fold Change") +
  theme_bw(base_size=14)

pval_hist + volcano

The distribution of P-values looks sensible and from the volcano plot we can see that we have identified some DA neighbourhoods at 10% FDR.

We can visualize DA neighbourhoods building an abstracted graph

liver_milo <- buildNhoodGraph(liver_milo)
plotNhoodGraphDA(liver_milo, milo_res, alpha = 0.1, size_range=c(2,6))

Note: This step is not required. It generates a large file (even larger than raw data) and then reloads it.

## Save milo object and results
# saveRDS(liver_milo,"~/dropbox-vu/temp/milo_data/Ramachandran2019_liver/liver_milo_20210225.RDS")
# write_csv(milo_res,"~/dropbox-vu/temp/milo_data/Ramachandran2019_liver/liver_results_20210225.csv")
# liver_milo <- readRDS("~/liver_milo_20201008.RDS")
liver_milo <- readRDS(url("https://www.dropbox.com/s/xdp07789c5hoen3/liver_milo_20210225.RDS?dl=1"))
# milo_res <- read_csv("/nfs/team205/ed6/data/Ramachandran2019_liver/liver_results_20201008.csv")
milo_res <- read_csv(url("https://www.dropbox.com/s/i1l3aep1py5wirf/liver_results_20210225.csv?dl=1"))

Making figures for the manuscript


colourCount = length(unique(liver_milo$annotation_lineage))
getPalette = colorRampPalette(brewer.pal(9, "Set2"))
Warning: n too large, allowed maximum for palette Set2 is 8
Returning the palette you asked for with that many colors
umap_df <- data.frame(reducedDim(liver_milo, "UMAP"))
colnames(umap_df) <- c("UMAP_1", "UMAP_2")

umap1 <- cbind(umap_df, annotation_lineage=liver_milo$annotation_lineage) %>%
  ggplot(aes(UMAP_1, UMAP_2, color=as.character(annotation_lineage))) +
  geom_point_rast(size=0.1, alpha=0.5, raster.dpi = 800) +
  ggrepel::geom_text_repel(data = . %>%
              group_by(annotation_lineage) %>%
              summarise(UMAP_1=mean(UMAP_1), UMAP_2=mean(UMAP_2)),
            aes(label=annotation_lineage), color="black", size=6
            ) +
  scale_color_manual(values=getPalette(colourCount)) +
  guides(color="none") +
  xlab("UMAP1") + ylab("UMAP2") +
  coord_fixed() +
  theme_classic(base_size = 22) +
  theme(axis.text = element_blank(), axis.ticks = element_blank())

fn_liver_umap1 <- file.path(OUTPUT_DIR,"/liver_v2/liver_umap1_new.pdf")

if(OVERWRITE | !file.exists(fn_liver_umap1)) {
  ggsave(fn_liver_umap1, height = 7, width = 8)
} else {
  message(cat(fn_liver_umap1,"exists and will not be overwritten\n"))
}
~/dropbox-vu/temp/milo_output//liver_v2/liver_umap1_new.pdf exists and will not be overwritten
umap2 <-
  cbind(umap_df, condition=as.character(liver_milo$condition)) %>%
  ggplot(aes(UMAP_1, UMAP_2, color=condition)) +
  geom_point_rast(size=0.1, alpha=0.5, raster.dpi = 800) +
  scale_color_brewer(palette="Set1", name='') +
  xlab("UMAP1") + ylab("UMAP2") +
  coord_fixed() +
  guides(color='none') +
  facet_wrap(condition~., ncol=1) +
  theme_nothing(font_size = 22) +
  theme(axis.text = element_blank(), axis.ticks = element_blank(), legend.position=c(0.9,0.9),
        strip.background = element_rect(color=NA), strip.text = element_text(size=22))


fn_liver_umap2 <- file.path(OUTPUT_DIR,"/liver_v2/liver_umap2_new.pdf")

if(OVERWRITE | !file.exists(fn_liver_umap1)) {
  ggsave(fn_liver_umap2, height = 7, width = 8)
} else {
  message(cat(fn_liver_umap2,"exists and will not be overwritten\n"))
}
~/dropbox-vu/temp/milo_output//liver_v2/liver_umap2_new.pdf exists and will not be overwritten
nh_graph_pl <- plotNhoodGraphDA(liver_milo, milo_res, alpha = 0.1, size_range=c(1,4)) +
  theme(legend.text = element_text(size=20), legend.title = element_text(size=22)) +
  coord_fixed()

fn_nh_graph <- file.path(OUTPUT_DIR,"/liver_v2/liver_graph_new.pdf")

if(OVERWRITE | !file.exists(fn_nh_graph)) {
  ggsave(fn_nh_graph, height = 7, width = 8)
} else {
  message(cat(fn_nh_graph,"exists and will not be overwritten\n"))
}
~/dropbox-vu/temp/milo_output//liver_v2/liver_graph_new.pdf exists and will not be overwritten
fig4_top <- (umap1 | umap2 | nh_graph_pl) +
  plot_layout(widths = c(3,1,3))

fig4_top

Explore DA neighbourhoods by cell type

Next, we can check the cell types where we observe most differences between healthy and cirrhotic cells, by taking the most frequent cell type in each neighbourhood.

milo_res <- milo_res[,!str_detect(colnames(milo_res), "annotation_lineage")]

# Add annotation of most frequent cell type per nhood to milo results table
milo_res <- annotateNhoods(liver_milo, milo_res, "annotation_indepth")
anno_df <- data.frame(liver_milo@colData) %>%
  distinct(annotation_lineage, annotation_indepth)
milo_res <- left_join(milo_res, anno_df, by="annotation_indepth")

We first check that neighbourhoods are sufficiently homogeneous

frac_hist <- ggplot(milo_res, aes(annotation_indepth_fraction)) +
  geom_histogram(bins=30) +
  xlab("Fraction of cells in \nmost abundant cluster") +
  ylab("# neighbourhoods") +
  theme_bw(base_size=14)

frac_hist

Filter nhoods with homogeneous composition

milo_res$annotation_indepth[milo_res$annotation_indepth_fraction < 0.6] <- NA
milo_res$annotation_lineage[milo_res$annotation_indepth_fraction < 0.6] <- NA

We can recover all the clusters where DA was detected in the original paper

group.by = "annotation_indepth"
paper_DA <- list(cirrhotic=c("MPs (4)","MPs (5)",
                             "Endothelia (6)", "Endothelia (7)",
                             "Mes (3)",
                             "Tcells (2)",
                             "Myofibroblasts"
                             ),
                 healthy=c("MPs (7)",
                           "Endothelia (1)",
                           "Tcells (1)", "Tcells (3)","Tcells (1)",
                           "ILCs (1)"
                           )
                 )

expDA_df <- bind_rows(
  data.frame(annotation_indepth = paper_DA[["cirrhotic"]], pred_DA="cirrhotic"),
  data.frame(annotation_indepth = paper_DA[["healthy"]], pred_DA="healthy")
  )

pl1 <- milo_res %>%
  left_join(expDA_df) %>%
  mutate(is_signif = ifelse(SpatialFDR < 0.1, 1, 0)) %>%
  mutate(logFC_color = ifelse(is_signif==1, logFC, NA)) %>%
  arrange(annotation_lineage) %>%
  mutate(Nhood=factor(Nhood, levels=unique(Nhood))) %>%
  filter(!is.na(annotation_lineage)) %>%
  ggplot(aes(annotation_indepth, logFC, color=logFC_color)) +
  scale_color_gradient2() +
  guides(color="none") +
  xlab(group.by) + ylab("Log Fold Change") +
  ggbeeswarm::geom_quasirandom(alpha=1) +
  coord_flip() +
  facet_grid(annotation_lineage~., scales="free", space="free") +
  theme_bw(base_size=22) +
  theme(strip.text.y =  element_text(angle=0),
        axis.title.y = element_blank(), axis.text.y = element_blank(), axis.ticks.y = element_blank(),
        )

pl2 <- milo_res %>%
  left_join(expDA_df) %>%
  # dplyr::filter(!is.na(pred_DA)) %>%
  group_by(annotation_indepth) %>%
  summarise(pred_DA=dplyr::first(pred_DA), annotation_lineage=dplyr::first(annotation_lineage)) %>%
  mutate(end=ifelse(pred_DA=="healthy", 0, 1),
         start=ifelse(pred_DA=="healthy", 1, 0)) %>%
  filter(!is.na(annotation_lineage)) %>%
  ggplot(aes(annotation_indepth, start, xend = annotation_indepth, yend = end, color=pred_DA)) +
  geom_segment(size=1,arrow=arrow(length = unit(0.1, "npc"), type="closed")) +
  coord_flip() +
  xlab("annotation") +
  facet_grid(annotation_lineage~.,
    # annotation_lineage~"Ramachandran et al.\nDA predictions",
             scales="free", space="free") +
  # guides(color="none") +
  scale_color_brewer(palette="Set1", direction = -1,
                     labels=c("enriched in cirrhotic", "enriched in healthy"),
                     na.translate = F,
                     name="Ramachandran et al.\nDA predictions") +
  guides(color=guide_legend(ncol = 1)) +
  theme_bw(base_size=22) +
  ylim(-0.1,1.1) +
  theme(strip.text.y = element_blank(),strip.text.x = element_text(angle=90),
        plot.margin = unit(c(0,0,0,0), "cm"), panel.grid = element_blank(),
        axis.title.x = element_blank(), axis.text.x = element_blank(), axis.ticks.x = element_blank(),
        legend.position = "bottom")

fig4_bleft <- (pl2 + pl1 +
  plot_layout(widths=c(1,10), guides = "collect") & theme(legend.position = 'top', legend.justification = 0))

fn_liver_DAcomp <- file.path(OUTPUT_DIR,"/liver_v2/liver_DAcomparison.pdf")

if(OVERWRITE | !file.exists(fn_liver_DAcomp)) {
  ggsave(fn_liver_DAcomp, height = 13, width = 8)
} else {
  message(cat(fn_liver_DAcomp,"exists and will not be overwritten\n"))
}
~/dropbox-vu/temp/milo_output//liver_v2/liver_DAcomparison.pdf exists and will not be overwritten
# ggsave("~/mount/gdrive/milo/Figures/liver_v2/liver_DAcomparison.pdf", width=8, height = 13)
# ggsave("~/dropbox-vu/temp/milo_output/liver_v2/liver_DAcomparison.pdf", width=8, height = 13)
fig4_bleft

Close-up on Endothelial lineage

endo_milo <- scater::runUMAP(liver_milo[,liver_milo$annotation_lineage=="Endothelia"],  dimred='PCA')
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
plotUMAP(endo_milo, colour_by = "annotation_indepth")

umap_df <- data.frame(reducedDim(endo_milo, "UMAP"))
colnames(umap_df) <- c("UMAP_1", "UMAP_2")

endo_umap <- cbind(umap_df, condition=endo_milo$condition) %>%
   ggplot(aes(UMAP_1, UMAP_2, color=condition)) +
  geom_point(size=0.3, alpha=0.5) +
  scale_color_brewer(palette="Set1", name='') +
  xlab("UMAP1") + ylab("UMAP2") +
  coord_fixed() +
  guides(color="none") +
  facet_wrap(condition~., ncol=1) +
  theme_nothing() +
  theme(axis.text = element_blank(), axis.ticks = element_blank(), legend.position=c(0.9,0.9),
        strip.background = element_rect(color=NA), strip.text = element_text(size=22))
liver_milo2 <- liver_milo
subset.nhoods <- str_detect(milo_res$annotation_indepth, "Endo")
reducedDim(liver_milo2, "UMAP")[colnames(endo_milo),] <- reducedDim(endo_milo, "UMAP") 

endo_gr <-
  plotNhoodGraphDA(
  liver_milo2, milo_res,
  subset.nhoods = which(milo_res$annotation_lineage == "Endothelia"), 
  size_range=c(1,4),
  # ) =)[1:(length()-1)], 
  alpha = 0.1
  )  +
   theme(legend.text = element_text(size=20), legend.title = element_text(size=22))
  
fig4_bright1 <- ((endo_umap + endo_gr ) + 
  plot_layout(widths = c(1,2), 
                guides = "collect"
                )) 
fig4_bright1

Can’t get code below to run

Unlcear what the problem is; data structure?

# # nh_graph <- nhoodGraph(liver_milo)[subset.nhoods,subset.nhoods]
# 
# nh_graph <- nhoodGraph(liver_milo)[na.omit(subset.nhoods)]
# 
# nh_graph <- graph_from_adjacency_matrix(nh_graph)
# 
# col_vals <- colData(liver_milo)[as.numeric(vertex_attr(nh_graph)$name), colour_by]
# V(nh_graph)$colour_by <- ifelse(milo_res[subset.nhoods,"SpatialFDR"] > 0.1, 0, milo_res[subset.nhoods,"logFC"])
# ggraph(simplify(nh_graph)) +
#       geom_edge_link0(edge_colour = "grey66", edge_alpha=0.2)   +
#       geom_node_point(aes(fill = colour_by), shape=21, size=2) +
#   scale_fill_gradient2()

Close-up on Cholangiocytes

chol_milo <- scater::runUMAP(liver_milo[,liver_milo$annotation_lineage=="Cholangiocytes"],  dimred='PCA')
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
plotUMAP(chol_milo, colour_by = "annotation_indepth")


plotUMAP(chol_milo, colour_by = "percent.mito")

Filter out cells that show contamination from immune cells (expression of immune markers)

keep <- logcounts(chol_milo)["CD19",] == 0 | logcounts(chol_milo)["MS4A1",] == 0
chol_milo <- chol_milo[,keep]
chol_milo <- scater::runUMAP(chol_milo,  dimred='PCA')
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
plotUMAP(chol_milo, colour_by = "annotation_indepth")

umap_df <- data.frame(reducedDim(chol_milo, "UMAP"))
colnames(umap_df) <- c("UMAP_1", "UMAP_2")

chol_umap <- cbind(umap_df, condition=chol_milo$condition) %>%
   ggplot(aes(UMAP_1, UMAP_2, color=condition)) +
  geom_point(size=0.3, alpha=0.5) +
  scale_color_brewer(palette="Set1", name='') +
  xlab("UMAP1") + ylab("UMAP2") +
  coord_fixed() +
  guides(color="none") +
  facet_wrap(condition~., ncol=1) +
  theme_nothing() +
  theme(axis.text = element_blank(), axis.ticks = element_blank(), legend.position=c(0.9,0.9),
        strip.background = element_rect(color=NA), strip.text = element_text(size=22))

chol_umap

liver_milo2 <- liver_milo
subset.nhoods <- milo_res$annotation_lineage=="Cholangiocytes"
reducedDim(liver_milo2, "UMAP")[colnames(chol_milo),] <- reducedDim(chol_milo, "UMAP") 

chol_gr <-
  plotNhoodGraphDA(
  liver_milo2, milo_res,
  subset.nhoods = subset.nhoods,
  size_range=c(2,5),
  # ) =)[1:(length()-1)], 
  alpha = 0.1
  )  +
   theme(legend.text = element_text(size=22), legend.title = element_text(size=24))
  
(chol_umap + chol_gr ) + 
  plot_layout(widths = c(1,2), 
                guides = "collect"
                )

# fig4_bright1 +
#   ggsave("~/milo_output/liver_endoGraph.pdf", width=9, height = 5)  

# ggsave("~/dropbox-vu/temp/milo_output/liver_endoGraph.pdf", width=9, height = 5)  

Differential Gene Expression analysis

In a subset of lineages, we want to test for differential expression between neighbourhoods enriched in cirrhotic cells and neighbourhoods enriched.

Add nhood expression to speed-up plotting of heatmaps

liver_milo <- calcNhoodExpression(liver_milo, assay = "logcounts", subset.row = hvgs)
# DRT: stopping run-through here
Should be able to run everything above.

Endothelia

Rebuttal figure showcasing grouping

set.seed(42)
milo_res_endogroups <- groupNhoods(liver_milo, milo_res, max.lfc.delta = 2, overlap = 1)
Found 1385 DA neighbourhoods at FDR 10%
nhoodAdjacency found - using for nhood grouping
p1 <- plotNhoodGroups(liver_milo, milo_res_endogroups, 
                size_range=c(1,3)) 

milo_res_endogroups <- annotateNhoods(liver_milo, milo_res_endogroups, 'annotation_lineage')

p2 <- plotDAbeeswarm(milo_res_endogroups, group.by = 'NhoodGroup') +
  facet_grid(annotation_lineage~., scales="free", space="free")
Converting group_by to factor...
## Plot expression in T cell neighbourhoods
# markers_df <- read_csv("~/mount/gdrive/milo/STable3_Ramachandran.csv")
tcell_marker_genes <- 
  markers_df %>%
  filter(cluster %in% c("Tcell", "ILC")) %>%
  top_n(30, myAUC) %>%
  pull(gene)
Error: object 'markers_df' not found

Group endothelial cells by logFC and DA results

Calculate marker genes between the two groups

Visualize as volcano

Visualize as heatmap

(gene expression values are scaled between 0 and 1 for each gene)

GO term analysis

Cholangiocytes

Calculate marker genes between the two groups

Visualize as volcano

GO term analysis


Assemble figure

Assemble supplementary figure

---
title: "Milo: liver cirrhosis analysis"
output: 
  html_notebook:
    code_folding: hide
---

## Introduction

In this notebook we demonstrate how to use Milo to detect abherrant cell states in diseased tissues, using a dataset of hepatic non-parenchymal cells isolated from 5 healthy and 5 cirrhotic human livers. [Ramachandran et al. 2019](https://www.nature.com/articles/s41586-019-1631-3#Sec1) (GEO accessiion: GSE136103).

## Installation
#### Added by DRT 2021-12-03
#### Updated by DRT 2023-11-29
```{r Installation}
source("getReqdPkgs.r")

pkgs <- c("SingleCellExperiment","scater","scran","Seurat","miloR",
  "tidyverse","patchwork","igraph","ggplot2",
  "ggrastr","msigdbr","clusterProfiler",
  "RColorBrewer","cowplot","devtools")

extra_pkgs <- c("irlba","Matrix") # NOTE these must be installed from source to make code work; see next code block

invisible(getReqdPkgs(pkgs))
```


### NOTE
As of 2023-11-29, the runPCA function in scater breaks due to change in Matrix library; must downgrade to Matrix 1.6.1
Downloaded v 1.6.1 from https://cran.r-project.org/src/contrib/Archive/Matrix/  

installing from source (compiling on your computer, rather than installing from a binary) may also be needed. On a Mac, this requires gfortran to be installed.  See [here](https://mac.r-project.org/tools/) and [link to download gfortran 12.2](https://mac.r-project.org/tools/gfortran-12.2-universal.pkg)  

See this link: https://github.com/bwlewis/irlba/issues/70

```{r}
# install.packages("irlba", type = "source")
# install.packages("Matrix", type = "source") # , version="1.6.1"
```

```{r}
library(scater)
library(scran)
library(miloR)
library(tidyverse)
library(patchwork)
library(igraph)
library(ggplot2)
library(ggrastr)
library(msigdbr)
library(cowplot)
library(Matrix)
library(irlba)
library(RColorBrewer)
library(SingleCellExperiment)
```
## Define some global variables
Location to save output files and whether to overwrite output files.
```{r}
OUTPUT_DIR <- "~/dropbox-vu/temp/milo_output"
OVERWRITE <- FALSE
```


## Load data

We downloaded the dataset and annotations stored in Seurat object from [here](https://datashare.is.ed.ac.uk/handle/10283/3433), as indicated by the authors.

```{r}
# load("/nfs/team205/ed6/data/Ramachandran2019_liver/tissue.rdata")
load(url("https://www.dropbox.com/s/bq816h74gmh84gu/tissue.rdata?dl=1"))
## Convert to SingleCellExperiment
liver_sce <- SingleCellExperiment(assay = list(counts=tissue@raw.data, logcounts=tissue@data),
                                  colData = tissue@meta.data)

liver_sce
```

## Preprocessing

We use the same number of highly variable genes and principal components used by the authors of the original study. 

### Feature selection

Select highly variable genes

```{r Select hvgs}
dec_liver <- modelGeneVar(liver_sce)

fit_liver <- metadata(dec_liver)
plot(fit_liver$mean, fit_liver$var, xlab="Mean of log-expression",
    ylab="Variance of log-expression")

hvgs <- getTopHVGs(dec_liver, n=3000)
```

### Dimensionality reduction

```{r, fig.height=8, fig.width=8}
set.seed(42)
liver_sce <- runPCA(liver_sce, subset_row=hvgs, ncomponents=11)
liver_sce <- runUMAP(liver_sce, dimred="PCA", ncomponents=2)

scater::plotUMAP(liver_sce, colour_by="condition", point_alpha=1,  point_size=0.5)
scater::plotUMAP(liver_sce, colour_by="dataset", point_alpha=0.3,  point_size=0.5)
scater::plotUMAP(liver_sce, colour_by="annotation_lineage", point_alpha=0.3,  point_size=0.5, text_by='annotation_lineage')
```

Notably, this dataset doesn't appear to display a batch effect

### Note: This step is not required. It generates a large file (even larger than raw data) and then reloads it. 
```{r save-load intermed file 1, eval=FALSE, include=TRUE, echo=TRUE}
# saveRDS(liver_sce, "~/dropbox-vu/temp/milo_data/Ramachandran2019_liver/liver_SCE_20210225.RDS")
# liver_sce <- readRDS(url("https://www.dropbox.com/s/z0aopf616b1urvj/liver_SCE_20210225.RDS?dl=1"))
```

## Differential Abundance (DA) analysis with Milo

We test for differential abundance between healthy and cirrhotic livers. We start by defining neighbourhoods with refined sampling on the KNN graph. We inspect the size of neighbourhoods.

```{r}
liver_milo <- Milo(liver_sce)

## Build KNN graph
liver_milo <- buildGraph(liver_milo, d = 11, k=30)

## Compute neighbourhoods with refined sampling
liver_milo <- makeNhoods(liver_milo, k=30, d=11, prop = 0.05, refined=TRUE)
plotNhoodSizeHist(liver_milo, bins=150)
```

Then we make a design matrix for the differential test, assigning samples to biological conditions.

```{r}
colData(liver_milo)[['sort']] <- str_remove(colData(liver_milo)[['dataset']], ".+_")
colData(liver_milo)[['sort']] <- str_remove(colData(liver_milo)[['sort']], "A|B")

liver_meta <- as_tibble(colData(liver_milo)[,c("dataset","condition", 'sort')])
liver_meta <- distinct(liver_meta) %>%
  mutate(condition=factor(condition, levels=c("Uninjured", "Cirrhotic"))) %>%
  column_to_rownames("dataset")

```

Now we can count cells in neighbourhoods and perform the DA test.  

NOTE: Executing `calcNhoodDistance()` function takes a long time (>30 min)
```{r}
liver_milo <- countCells(liver_milo, samples = "dataset", meta.data = data.frame(colData(liver_milo)[,c("dataset","condition",'sort')]) )
liver_milo <- calcNhoodDistance(liver_milo, d=11)
milo_res <- testNhoods(liver_milo, design = ~ condition, design.df = liver_meta[colnames(nhoodCounts(liver_milo)),])
milo_res_sort <- testNhoods(liver_milo, design = ~ sort + condition, design.df = 
                                liver_meta[colnames(nhoodCounts(liver_milo)),])
```

```{r}
compare_da_df <- left_join(milo_res_sort, milo_res, by="Nhood", suffix=c("_sort", "_nosort")) %>%
  {annotateNhoods(liver_milo, ., 'annotation_lineage')} 

compare_da_df %>%
  ggplot(aes(-log10(SpatialFDR_sort), -log10(SpatialFDR_nosort))) +
  geom_point(size=0.8) +
  geom_point(data=. %>% filter(annotation_lineage=="Endothelia"), color="red")
plot(milo_res_sort$SpatialFDR, milo_res$SpatialFDR)
```


## Exploration of Milo DA results

We can start by looking at some basic stats

```{r}
pval_hist <- milo_res %>%
  ggplot(aes(PValue)) +
  geom_histogram(bins=50) +
  theme_bw(base_size=14)

volcano <-
  milo_res %>%
  ggplot(aes(logFC, -log10(SpatialFDR))) +
  geom_point(size=0.4, alpha=0.2) +
  geom_hline(yintercept = -log10(0.1)) +
  xlab("log-Fold Change") +
  theme_bw(base_size=14)

pval_hist + volcano
```

The distribution of P-values looks sensible and from the volcano plot we can see that we have identified some DA neighbourhoods at 10% FDR.

We can visualize DA neighbourhoods building an abstracted graph

```{r, fig.width=14, fig.height=10}
liver_milo <- buildNhoodGraph(liver_milo)
plotNhoodGraphDA(liver_milo, milo_res, alpha = 0.1, size_range=c(2,6))
```

### Note: This step is not required. It generates a large file (even larger than raw data) and then reloads it. 
```{r save intermed files 2, eval=FALSE, include=TRUE, echo=TRUE}
## Save milo object and results
# saveRDS(liver_milo,"~/dropbox-vu/temp/milo_data/Ramachandran2019_liver/liver_milo_20210225.RDS")
# write_csv(milo_res,"~/dropbox-vu/temp/milo_data/Ramachandran2019_liver/liver_results_20210225.csv")

```

```{r load intermed files 2, eval=FALSE, include=TRUE, echo=TRUE}
# liver_milo <- readRDS("~/liver_milo_20201008.RDS")
liver_milo <- readRDS(url("https://www.dropbox.com/s/xdp07789c5hoen3/liver_milo_20210225.RDS?dl=1"))
# milo_res <- read_csv("/nfs/team205/ed6/data/Ramachandran2019_liver/liver_results_20201008.csv")
milo_res <- read_csv(url("https://www.dropbox.com/s/i1l3aep1py5wirf/liver_results_20210225.csv?dl=1"))
```


### Making figures for the manuscript

```{r, fig.width=15, fig.height=10}

colourCount = length(unique(liver_milo$annotation_lineage))
getPalette = colorRampPalette(brewer.pal(9, "Set2"))

umap_df <- data.frame(reducedDim(liver_milo, "UMAP"))
colnames(umap_df) <- c("UMAP_1", "UMAP_2")

umap1 <- cbind(umap_df, annotation_lineage=liver_milo$annotation_lineage) %>%
  ggplot(aes(UMAP_1, UMAP_2, color=as.character(annotation_lineage))) +
  geom_point_rast(size=0.1, alpha=0.5, raster.dpi = 800) +
  ggrepel::geom_text_repel(data = . %>%
              group_by(annotation_lineage) %>%
              summarise(UMAP_1=mean(UMAP_1), UMAP_2=mean(UMAP_2)),
            aes(label=annotation_lineage), color="black", size=6
            ) +
  scale_color_manual(values=getPalette(colourCount)) +
  guides(color="none") +
  xlab("UMAP1") + ylab("UMAP2") +
  coord_fixed() +
  theme_classic(base_size = 22) +
  theme(axis.text = element_blank(), axis.ticks = element_blank())

fn_liver_umap1 <- file.path(OUTPUT_DIR,"/liver_v2/liver_umap1_new.pdf")

if(OVERWRITE | !file.exists(fn_liver_umap1)) {
  ggsave(fn_liver_umap1, height = 7, width = 8)
} else {
  message(cat(fn_liver_umap1,"exists and will not be overwritten\n"))
}


umap2 <-
  cbind(umap_df, condition=as.character(liver_milo$condition)) %>%
  ggplot(aes(UMAP_1, UMAP_2, color=condition)) +
  geom_point_rast(size=0.1, alpha=0.5, raster.dpi = 800) +
  scale_color_brewer(palette="Set1", name='') +
  xlab("UMAP1") + ylab("UMAP2") +
  coord_fixed() +
  guides(color='none') +
  facet_wrap(condition~., ncol=1) +
  theme_nothing(font_size = 22) +
  theme(axis.text = element_blank(), axis.ticks = element_blank(), legend.position=c(0.9,0.9),
        strip.background = element_rect(color=NA), strip.text = element_text(size=22))


fn_liver_umap2 <- file.path(OUTPUT_DIR,"/liver_v2/liver_umap2_new.pdf")

if(OVERWRITE | !file.exists(fn_liver_umap1)) {
  ggsave(fn_liver_umap2, height = 7, width = 8)
} else {
  message(cat(fn_liver_umap2,"exists and will not be overwritten\n"))
}

nh_graph_pl <- plotNhoodGraphDA(liver_milo, milo_res, alpha = 0.1, size_range=c(1,4)) +
  theme(legend.text = element_text(size=20), legend.title = element_text(size=22)) +
  coord_fixed()

fn_nh_graph <- file.path(OUTPUT_DIR,"/liver_v2/liver_graph_new.pdf")

if(OVERWRITE | !file.exists(fn_nh_graph)) {
  ggsave(fn_nh_graph, height = 7, width = 8)
} else {
  message(cat(fn_nh_graph,"exists and will not be overwritten\n"))
}

```


```{r , fig.width=15, fig.height=10}
fig4_top <- (umap1 | umap2 | nh_graph_pl) +
  plot_layout(widths = c(3,1,3))

fig4_top
```

### Explore DA neighbourhoods by cell type

Next, we can check the cell types where we observe most differences between healthy and cirrhotic cells, by taking the most frequent cell type in each neighbourhood.

```{r, fig.width=9, fig.height=10}
milo_res <- milo_res[,!str_detect(colnames(milo_res), "annotation_lineage")]

# Add annotation of most frequent cell type per nhood to milo results table
milo_res <- annotateNhoods(liver_milo, milo_res, "annotation_indepth")
anno_df <- data.frame(liver_milo@colData) %>%
  distinct(annotation_lineage, annotation_indepth)
milo_res <- left_join(milo_res, anno_df, by="annotation_indepth")
```

We first check that neighbourhoods are sufficiently homogeneous

```{r}
frac_hist <- ggplot(milo_res, aes(annotation_indepth_fraction)) +
  geom_histogram(bins=30) +
  xlab("Fraction of cells in \nmost abundant cluster") +
  ylab("# neighbourhoods") +
  theme_bw(base_size=14)

frac_hist
```

Filter nhoods with homogeneous composition

```{r}
milo_res$annotation_indepth[milo_res$annotation_indepth_fraction < 0.6] <- NA
milo_res$annotation_lineage[milo_res$annotation_indepth_fraction < 0.6] <- NA
```


We can recover all the clusters where DA was detected in the original paper

```{r, fig.width=10, fig.height=10, warning=FALSE, message=FALSE}
group.by = "annotation_indepth"
paper_DA <- list(cirrhotic=c("MPs (4)","MPs (5)",
                             "Endothelia (6)", "Endothelia (7)",
                             "Mes (3)",
                             "Tcells (2)",
                             "Myofibroblasts"
                             ),
                 healthy=c("MPs (7)",
                           "Endothelia (1)",
                           "Tcells (1)", "Tcells (3)","Tcells (1)",
                           "ILCs (1)"
                           )
                 )

expDA_df <- bind_rows(
  data.frame(annotation_indepth = paper_DA[["cirrhotic"]], pred_DA="cirrhotic"),
  data.frame(annotation_indepth = paper_DA[["healthy"]], pred_DA="healthy")
  )

pl1 <- milo_res %>%
  left_join(expDA_df) %>%
  mutate(is_signif = ifelse(SpatialFDR < 0.1, 1, 0)) %>%
  mutate(logFC_color = ifelse(is_signif==1, logFC, NA)) %>%
  arrange(annotation_lineage) %>%
  mutate(Nhood=factor(Nhood, levels=unique(Nhood))) %>%
  filter(!is.na(annotation_lineage)) %>%
  ggplot(aes(annotation_indepth, logFC, color=logFC_color)) +
  scale_color_gradient2() +
  guides(color="none") +
  xlab(group.by) + ylab("Log Fold Change") +
  ggbeeswarm::geom_quasirandom(alpha=1) +
  coord_flip() +
  facet_grid(annotation_lineage~., scales="free", space="free") +
  theme_bw(base_size=22) +
  theme(strip.text.y =  element_text(angle=0),
        axis.title.y = element_blank(), axis.text.y = element_blank(), axis.ticks.y = element_blank(),
        )

pl2 <- milo_res %>%
  left_join(expDA_df) %>%
  # dplyr::filter(!is.na(pred_DA)) %>%
  group_by(annotation_indepth) %>%
  summarise(pred_DA=dplyr::first(pred_DA), annotation_lineage=dplyr::first(annotation_lineage)) %>%
  mutate(end=ifelse(pred_DA=="healthy", 0, 1),
         start=ifelse(pred_DA=="healthy", 1, 0)) %>%
  filter(!is.na(annotation_lineage)) %>%
  ggplot(aes(annotation_indepth, start, xend = annotation_indepth, yend = end, color=pred_DA)) +
  geom_segment(size=1,arrow=arrow(length = unit(0.1, "npc"), type="closed")) +
  coord_flip() +
  xlab("annotation") +
  facet_grid(annotation_lineage~.,
    # annotation_lineage~"Ramachandran et al.\nDA predictions",
             scales="free", space="free") +
  # guides(color="none") +
  scale_color_brewer(palette="Set1", direction = -1,
                     labels=c("enriched in cirrhotic", "enriched in healthy"),
                     na.translate = F,
                     name="Ramachandran et al.\nDA predictions") +
  guides(color=guide_legend(ncol = 1)) +
  theme_bw(base_size=22) +
  ylim(-0.1,1.1) +
  theme(strip.text.y = element_blank(),strip.text.x = element_text(angle=90),
        plot.margin = unit(c(0,0,0,0), "cm"), panel.grid = element_blank(),
        axis.title.x = element_blank(), axis.text.x = element_blank(), axis.ticks.x = element_blank(),
        legend.position = "bottom")

fig4_bleft <- (pl2 + pl1 +
  plot_layout(widths=c(1,10), guides = "collect") & theme(legend.position = 'top', legend.justification = 0))

fn_liver_DAcomp <- file.path(OUTPUT_DIR,"/liver_v2/liver_DAcomparison.pdf")

if(OVERWRITE | !file.exists(fn_liver_DAcomp)) {
  ggsave(fn_liver_DAcomp, height = 13, width = 8)
} else {
  message(cat(fn_liver_DAcomp,"exists and will not be overwritten\n"))
}
# ggsave("~/mount/gdrive/milo/Figures/liver_v2/liver_DAcomparison.pdf", width=8, height = 13)
# ggsave("~/dropbox-vu/temp/milo_output/liver_v2/liver_DAcomparison.pdf", width=8, height = 13)
```

```{r fig.width=8, fig.height=13}
fig4_bleft
```


### Close-up on Endothelial lineage

```{r}
endo_milo <- scater::runUMAP(liver_milo[,liver_milo$annotation_lineage=="Endothelia"],  dimred='PCA')
plotUMAP(endo_milo, colour_by = "annotation_indepth")
```

```{r}
umap_df <- data.frame(reducedDim(endo_milo, "UMAP"))
colnames(umap_df) <- c("UMAP_1", "UMAP_2")

endo_umap <- cbind(umap_df, condition=endo_milo$condition) %>%
   ggplot(aes(UMAP_1, UMAP_2, color=condition)) +
  geom_point(size=0.3, alpha=0.5) +
  scale_color_brewer(palette="Set1", name='') +
  xlab("UMAP1") + ylab("UMAP2") +
  coord_fixed() +
  guides(color="none") +
  facet_wrap(condition~., ncol=1) +
  theme_nothing() +
  theme(axis.text = element_blank(), axis.ticks = element_blank(), legend.position=c(0.9,0.9),
        strip.background = element_rect(color=NA), strip.text = element_text(size=22))
```

```{r, fig.width=8, fig.height=4, message=FALSE, warning=FALSE}
liver_milo2 <- liver_milo
subset.nhoods <- str_detect(milo_res$annotation_indepth, "Endo")
reducedDim(liver_milo2, "UMAP")[colnames(endo_milo),] <- reducedDim(endo_milo, "UMAP") 

endo_gr <-
  plotNhoodGraphDA(
  liver_milo2, milo_res,
  subset.nhoods = which(milo_res$annotation_lineage == "Endothelia"), 
  size_range=c(1,4),
  # ) =)[1:(length()-1)], 
  alpha = 0.1
  )  +
   theme(legend.text = element_text(size=20), legend.title = element_text(size=22))
  
fig4_bright1 <- ((endo_umap + endo_gr ) + 
  plot_layout(widths = c(1,2), 
                guides = "collect"
                )) 
fig4_bright1
```


### Can't get code below to run
Unlcear what the problem is; data structure?
```{r}
# # nh_graph <- nhoodGraph(liver_milo)[subset.nhoods,subset.nhoods]
# 
# nh_graph <- nhoodGraph(liver_milo)[na.omit(subset.nhoods)]
# 
# nh_graph <- graph_from_adjacency_matrix(nh_graph)
# 
# col_vals <- colData(liver_milo)[as.numeric(vertex_attr(nh_graph)$name), colour_by]
# V(nh_graph)$colour_by <- ifelse(milo_res[subset.nhoods,"SpatialFDR"] > 0.1, 0, milo_res[subset.nhoods,"logFC"])
# ggraph(simplify(nh_graph)) +
#       geom_edge_link0(edge_colour = "grey66", edge_alpha=0.2)   +
#       geom_node_point(aes(fill = colour_by), shape=21, size=2) +
#   scale_fill_gradient2()
```


### Close-up on Cholangiocytes

```{r}
chol_milo <- scater::runUMAP(liver_milo[,liver_milo$annotation_lineage=="Cholangiocytes"],  dimred='PCA')
plotUMAP(chol_milo, colour_by = "annotation_indepth")

plotUMAP(chol_milo, colour_by = "percent.mito")
```

Filter out cells that show contamination from immune cells (expression of immune markers)

```{r}
keep <- logcounts(chol_milo)["CD19",] == 0 | logcounts(chol_milo)["MS4A1",] == 0
chol_milo <- chol_milo[,keep]
chol_milo <- scater::runUMAP(chol_milo,  dimred='PCA')

plotUMAP(chol_milo, colour_by = "annotation_indepth")
```

```{r, fig.width=10, fig.height=8}
umap_df <- data.frame(reducedDim(chol_milo, "UMAP"))
colnames(umap_df) <- c("UMAP_1", "UMAP_2")

chol_umap <- cbind(umap_df, condition=chol_milo$condition) %>%
   ggplot(aes(UMAP_1, UMAP_2, color=condition)) +
  geom_point(size=0.3, alpha=0.5) +
  scale_color_brewer(palette="Set1", name='') +
  xlab("UMAP1") + ylab("UMAP2") +
  coord_fixed() +
  guides(color="none") +
  facet_wrap(condition~., ncol=1) +
  theme_nothing() +
  theme(axis.text = element_blank(), axis.ticks = element_blank(), legend.position=c(0.9,0.9),
        strip.background = element_rect(color=NA), strip.text = element_text(size=22))

chol_umap
```

```{r, fig.width=8, fig.height=4, warning=FALSE, message=FALSE}
liver_milo2 <- liver_milo
subset.nhoods <- milo_res$annotation_lineage=="Cholangiocytes"
reducedDim(liver_milo2, "UMAP")[colnames(chol_milo),] <- reducedDim(chol_milo, "UMAP") 

chol_gr <-
  plotNhoodGraphDA(
  liver_milo2, milo_res,
  subset.nhoods = subset.nhoods,
  size_range=c(2,5),
  # ) =)[1:(length()-1)], 
  alpha = 0.1
  )  +
   theme(legend.text = element_text(size=22), legend.title = element_text(size=24))
  
(chol_umap + chol_gr ) + 
  plot_layout(widths = c(1,2), 
                guides = "collect"
                )
# fig4_bright1 +
#   ggsave("~/milo_output/liver_endoGraph.pdf", width=9, height = 5)  

# ggsave("~/dropbox-vu/temp/milo_output/liver_endoGraph.pdf", width=9, height = 5)  
```

### Differential Gene Expression analysis

In a subset of lineages, we want to test for differential expression between neighbourhoods enriched in cirrhotic cells and neighbourhoods enriched.  

Add nhood expression to speed-up plotting of heatmaps

```{r}
liver_milo <- calcNhoodExpression(liver_milo, assay = "logcounts", subset.row = hvgs)
```

-----
# DRT: stopping run-through here
Should be able to run everything above.
-----

## Endothelia

Rebuttal figure showcasing grouping

```{r, fig.height=5, fig.width=14}
set.seed(42)
milo_res_endogroups <- groupNhoods(liver_milo, milo_res, max.lfc.delta = 2, overlap = 1)

p1 <- plotNhoodGroups(liver_milo, milo_res_endogroups, 
                size_range=c(1,3)) 

milo_res_endogroups <- annotateNhoods(liver_milo, milo_res_endogroups, 'annotation_lineage')

p2 <- plotDAbeeswarm(milo_res_endogroups, group.by = 'NhoodGroup') +
  facet_grid(annotation_lineage~., scales="free", space="free")


## Plot expression in T cell neighbourhoods
# markers_df <- read_csv("~/mount/gdrive/milo/STable3_Ramachandran.csv")
tcell_marker_genes <- 
  markers_df %>%
  filter(cluster %in% c("Tcell", "ILC")) %>%
  top_n(30, myAUC) %>%
  pull(gene)

p3 <- plotNhoodExpressionGroups(liver_milo, milo_res_endogroups, features = unique(tcell_marker_genes), 
                          subset.nhoods = milo_res_endogroups$NhoodGroup %in% c("3","10", "14"),
                          scale=TRUE, cluster_features = TRUE,show_rownames = TRUE
                          ) +
  theme(strip.text.x = element_text(angle=90))

```
```{r, fig.width=15, fig.height=10}
(((p1 + theme())/ (p3 + theme(strip.text = element_text(size=10, angle=45)))) + 
  plot_layout(heights = c(1.1,1), guides="collect"
              )| 
  (
    p2 + theme_bw(base_size=16) + theme(strip.text.y = element_text(angle=0))
    )) + 
  plot_layout(widths = c(1.4, 1)) +
  plot_annotation(tag_levels = c("A", "C", "B") ) +
  # ggsave("~/mount/gdrive/milo/Figures/liver_v2/RFig_grouping.pdf", width=15, height = 12) +
  # ggsave("~/mount/gdrive/milo/Figures/liver_v2/RFig_grouping.png", width=15, height = 12)
```

Group endothelial cells by logFC and DA results

```{r}
milo_res_endogroups$annotation_indepth[milo_res_endogroups$annotation_indepth_fraction < 0.6] <- NA
milo_res_endogroups$annotation_lineage[milo_res_endogroups$annotation_indepth_fraction < 0.6] <- NA

## Group neighbourhoods by DA outcome
milo_res_endogroups$NhoodGroup <- NA
milo_res_endogroups$NhoodGroup <- ifelse((milo_res_endogroups$annotation_lineage == "Endothelia") & (milo_res_endogroups$SpatialFDR < 0.1) & (milo_res_endogroups$logFC < -2.5), "54", milo_res_endogroups$NhoodGroup)
milo_res_endogroups$NhoodGroup <- ifelse((milo_res_endogroups$annotation_lineage == "Endothelia") & (milo_res_endogroups$SpatialFDR < 0.1) & (milo_res_endogroups$logFC > 2.5), "70", milo_res_endogroups$NhoodGroup)


liver_milo2 <- liver_milo
subset.nhoods <- str_detect(milo_res$annotation_indepth, "Endo")
reducedDim(liver_milo2, "UMAP")[colnames(endo_milo),] <- reducedDim(endo_milo, "UMAP") 
endo_gr_groups <- plotNhoodGroups(liver_milo2, milo_res_endogroups[milo_res_endogroups$annotation_lineage=="Endothelia",], 
                show_groups = c("54", "70"),
                size_range=c(1,4),
                subset.nhoods = milo_res_endogroups$annotation_lineage=="Endothelia") +
  scale_fill_manual(values=c("54"=brewer.pal(4, "Spectral")[2], "70"=brewer.pal(4, "Spectral")[3]), 
                    labels=c("54"="Uninjured group", '70'= "Cirrhotic group"),
                    na.value="white",
                    name = "Nhood group"
                    ) +
  theme(legend.text = element_text(size=20), legend.title = element_text(size=22))
```

```{r, fig.width=18, fig.height=5}
fig4_bright1 <- ((endo_umap + endo_gr) + 
  plot_layout(widths = c(1,2), guides="collect"
                )) &
  theme(legend.box = "horizontal", legend.position = "top", legend.direction = "vertical")
fig4_bright1
```


Calculate marker genes between the two groups
```{r}
mito_genes <- str_detect(hvgs, "^MT-")
markers_df <- findNhoodGroupMarkers(liver_milo, da.res = milo_res_endogroups, assay="counts",
                      subset.nhoods = (milo_res_endogroups$NhoodGroup %in% c("54", "70")),
                      subset.groups = c("54", "70"),
                      subset.row = hvgs[!mito_genes],
                      aggregate.samples = TRUE, sample_col = "dataset"
                      )

milo_res_endogroups[milo_res_endogroups$NhoodGroup %in% c("54", "70"),]

colnames(markers_df) <- str_replace(colnames(markers_df), "70", "cirr")
colnames(markers_df) <- str_replace(colnames(markers_df), "54", "uninj")
```

#### Visualize as volcano 

```{r, fig.height=6, fig.width=10, message=FALSE, warning=FALSE}

highlight_genes <- c("PLVAP", "VWA1", "ACKR1", "IL32",
                     "CLEC4G", "CLEC4M", "FCN2", "FCN3",
                     "LEF1")

marker.df <- markers_df
marker.df %>%
  mutate(label=ifelse(GeneID %in% highlight_genes, GeneID, NA)) %>%
  ggplot(aes(logFC_cirr, -log10(adj.P.Val_cirr), 
             # color=highlight
             )) + 
  geom_point() +
  geom_text(aes(label=label), color="red") +
  xlab("logFC") + ylab("- log10(Adj. P value)") +
  theme_bw(base_size = 22)
  
```


#### Visualize as heatmap 
(gene expression values are scaled between 0 and 1 for each gene)

```{r, fig.height=10, fig.width=12, message=FALSE, warning=FALSE}
marker_genes <- marker.df %>%
  dplyr::filter(adj.P.Val_cirr < 0.05) %>%
  pull(GeneID)

fig4_bbright <-
  plotNhoodExpressionDA(liver_milo, milo_res_endogroups, c(marker_genes), cluster_features = TRUE, assay = "counts",
                      alpha = 0.1,
                      scale_to_1 = TRUE,
                      subset.nhoods =  milo_res_endogroups$NhoodGroup %in% c("54", "70"),
                      # grid.space = "free",
                      highlight_features = highlight_genes, show_rownames = FALSE
                      ) +
  ylab("DE genes")+
  # facet_grid(.~NhoodGroup, scales="free", space="free")
   theme(legend.text = element_text(size=22), legend.title = element_text(size=24)) +
  plot_layout(heights = c(1,10)) & theme(legend.margin = margin(0,0,0,60), legend.background = element_blank())

  
pl3 <- fig4_bbright$data %>%
  ggplot(aes(logFC_rank, 1,fill=logFC)) +
  geom_tile() +
      theme_classic(base_size=16) +
    ylab("") +
  scale_fill_gradient2(name="DA logFC") +
    # scale_fill_manual(values=c("54"=brewer.pal(4, "Spectral")[2], "70"=brewer.pal(4, "Spectral")[3]), 
    #                 labels=c("54"="Uninjured group", '70'= "Cirrhotic group"),
    #                 na.value="white",
    #                 name = "Nhood group"
    #                 ) +
    scale_x_continuous(expand = c(0.01, 0)) +
    theme(axis.text = element_blank(), axis.ticks = element_blank(), axis.line = element_blank(), 
          axis.title = element_blank())

fig4_bbright <- pl3 / fig4_bbright  +
  plot_layout(heights = c(1,20))

fig4_bbright
```

### GO term analysis

```{r, echo=FALSE, warning=FALSE, message=FALSE}
# BiocManager::install('clusterProfiler')
# BiocManager::install('msigdbr')
library(clusterProfiler)
library(msigdbr)

m_df <- msigdbr(species = "Homo sapiens")
m_t2g <- msigdbr(species = "Homo sapiens", category = "C5", subcategory = "BP")  %>% 
  dplyr::select(gs_name, gene_symbol)
```

```{r, echo=FALSE, warning=FALSE, message=FALSE}
marker_genes_up <- marker.df %>%
  dplyr::filter(adj.P.Val_cirr < 0.05 & logFC_cirr > 0.5) %>%
  pull(GeneID) 

marker_genes_down <- marker.df %>%
  dplyr::filter(adj.P.Val_cirr < 0.05 & logFC_uninj > 0.5) %>%
  pull(GeneID)

em_up <- enricher(marker_genes_up, TERM2GENE=m_t2g, pAdjustMethod = "fdr", 
                  universe = hvgs
                  )
em_down <- enricher(marker_genes_down, TERM2GENE=m_t2g, pAdjustMethod = "fdr", 
                    universe = rownames(liver_milo)
                    )

em_res_up <- em_up@result[em_up@result$qvalue < 0.1,] %>%
  dplyr::select(- c(Description))
em_res_down <- em_down@result[em_down@result$qvalue < 0.1,] %>%
  dplyr::select(- c(Description))
```

```{r, fig.height=8, fig.width=15, warning=FALSE, message=FALSE}
go_endo_up <- em_res_up %>%
  top_n(30, -log10(qvalue)) %>%
   mutate(ID=ifelse(ID=='GO_ANTIGEN_PROCESSING_AND_PRESENTATION_OF_PEPTIDE_OR_POLYSACCHARIDE_ANTIGEN_VIA_MHC_CLASS_II', "GO_ANTIGEN_PRESENTATION_VIA_MHC_CLASS_II", ID)) %>%
  mutate(Term=factor(ID, levels=rev(unique(ID)))) %>%
  ggplot(aes(Term, -log10(qvalue))) +
  geom_point() +
  coord_flip() +
  xlab("GO Biological Function") + ylab("-log10(Adj. p-value)") +
  theme_bw(base_size=18) +
  ggtitle("Cirrhotic endothelia")

go_endo_down <- em_res_down %>%
  top_n(30, -log10(qvalue)) %>%
  mutate(ID=ifelse(ID=='GO_ANTIGEN_PROCESSING_AND_PRESENTATION_OF_PEPTIDE_OR_POLYSACCHARIDE_ANTIGEN_VIA_MHC_CLASS_II', "GO_ANTIGEN_PRESENTATION_VIA_MHC_CLASS_II", ID)) %>%
  mutate(Term=factor(ID, levels=rev(unique(ID)))) %>%
  ggplot(aes(Term, -log10(qvalue))) +
  geom_point() +
  coord_flip() +
  xlab("GO Biological Function") + ylab("-log10(Adj. p-value)") +
  theme_bw(base_size=18) +
  ggtitle("Uninjured endothelia")

go_endo_up
go_endo_down
```


```{r}
em_res_up
em_res_down
```

## Cholangiocytes

```{r}
set.seed(42)
milo_res_cholgroups <- groupNhoods(liver_milo, milo_res, max.lfc.delta = 0.5, overlap = 1)

## Group neighbourhoods by DA outcome
milo_res_cholgroups$NhoodGroup <- NA
milo_res_cholgroups$NhoodGroup <- ifelse((milo_res_cholgroups$annotation_lineage == "Cholangiocytes") & (milo_res_cholgroups$SpatialFDR < 0.1) & (milo_res_cholgroups$logFC < -2.5), "38", milo_res_cholgroups$NhoodGroup)
milo_res_cholgroups$NhoodGroup <- ifelse((milo_res_cholgroups$annotation_lineage == "Cholangiocytes") & (milo_res_cholgroups$SpatialFDR < 0.1) & (milo_res_cholgroups$logFC > 2.5), "49", milo_res_cholgroups$NhoodGroup)

liver_milo2 <- liver_milo
subset.nhoods <- str_detect(milo_res$annotation_indepth, "Chol")
reducedDim(liver_milo2, "UMAP")[colnames(chol_milo),] <- reducedDim(chol_milo, "UMAP") 
plotNhoodGroups(liver_milo2, milo_res_cholgroups[milo_res_cholgroups$annotation_lineage=="Cholangiocytes",], 
                show_groups = c("49","38"),
                subset.nhoods =  milo_res_cholgroups$annotation_lineage =="Cholangiocytes")

```

Calculate marker genes between the two groups
```{r}
## Filter genes expressed in cholangiocytes
# chol_hvgs <- hvgs[(counts(chol_milo)[hvgs,] > 0) %>% {rowSums(.)/ncol(chol_milo)} > 0.01]
mito_genes <- str_detect(hvgs, "^MT-")

markers_df <- findNhoodGroupMarkers(liver_milo, da.res = milo_res_cholgroups, assay="counts",
                      subset.nhoods = milo_res_cholgroups$NhoodGroup %in%c("49","38"),
                      subset.groups = c("49","38"),
                      subset.row = hvgs[!mito_genes],
                      aggregate.samples = TRUE, sample_col = "dataset"
                      )

markers_df 

milo_res_cholgroups[milo_res_cholgroups$NhoodGroup %in%c("49","38"),]
```

#### Visualize as volcano 

```{r, fig.height=6, fig.width=10, message=FALSE, warning=FALSE}
marker.df.chol <- markers_df

volcano_chol <-
  marker.df.chol %>%
  mutate(up=ifelse(logFC_49 > 0, "up", "down")) %>%
  group_by(up) %>%
  mutate(label=ifelse(rank(adj.P.Val_49) < 15, GeneID, NA)) %>%
  # mutate(label=ifelse((adj.P.Val_49 < 0.05 & logFC_49 < -3) | (adj.P.Val_49 < 0.05 & logFC_49 > 0), GeneID, NA)) %>%
  ggplot(aes(logFC_49, -log10(adj.P.Val_49), 
             # color=highlight
             )) + 
  geom_point(size=0.8, alpha=0.6) +
  ggrepel::geom_text_repel(aes(label=label), segment.alpha = 0.2) +
  xlab("logFC") + ylab("- log10(Adj. P value)") +
  theme_bw(base_size = 22)

volcano_chol  
  
```



### GO term analysis

```{r, echo=FALSE, warning=FALSE, message=FALSE}
marker_genes_chol <- marker.df.chol %>%
  dplyr::filter(adj.P.Val_49 < 0.05 & logFC_49 > 0) %>%
  pull(GeneID)

em_up_chol <- enricher(marker_genes_chol, TERM2GENE=m_t2g, pAdjustMethod = "fdr", 
                  universe = rownames(liver_milo)
                  )

em_res_up_chol <- em_up_chol@result[em_up_chol@result$qvalue < 0.1,] %>%
  dplyr::select(- c(Description))
```

```{r, fig.height=8, fig.width=15, warning=FALSE, message=FALSE}
go_chol_up <- em_res_up_chol %>%
  top_n(20, -log10(qvalue)) %>%
  mutate(Term=factor(ID, levels=rev(unique(ID)))) %>%
  ggplot(aes(Term, -log10(qvalue))) +
  geom_point() +
  coord_flip() +
  xlab("GO Biological Function") + ylab("-log10(Adj. p-value)") +
  theme_bw(base_size=18) +
  ggtitle("Cirrhotic cholangiocytes")

go_chol_up
```

```{r}
em_res_up_chol
```
```{r, echo=FALSE, warning=FALSE, message=FALSE}
marker_genes_chol_down <- marker.df.chol %>%
  dplyr::filter(adj.P.Val_49 < 0.05 & logFC_49 < 0) %>%
  pull(GeneID)

em_down_chol <- enricher(marker_genes_chol_down, TERM2GENE=m_t2g, pAdjustMethod = "fdr", 
                  universe = rownames(liver_milo)
                  )

em_res_down_chol <- em_down_chol@result[em_down_chol@result$qvalue < 0.1,] %>%
  dplyr::select(- c(Description))
```


---

Assemble figure
```{r, fig.height=25, fig.width=19}
fig4_bottom <- ((fig4_bleft + plot_layout()) |
      ((fig4_bright1 + plot_layout(tag_level = 'keep')) / (fig4_bbright + plot_layout())) +
      plot_layout(heights = c(1,1.6))
   ) +
  plot_layout(widths=c(1,1.4))

(fig4_top / fig4_bottom) +
  plot_layout(heights=c(1,1.8))  +
  ggsave("~/mount/gdrive/milo/Figures/liver_v2/fig4_raw.pdf", height = 26, width = 24, useDingbats=FALSE) 
  # ggsave("~/mount/gdrive/milo/Figures/liver_v2/fig4_raw.png", height = 26, width = 24, useDingbats=FALSE)
  # ggsave("~/milo/ms/figures/figs/figure5.pdf", height = 26, width = 22, useDingbats=FALSE)
```

Assemble supplementary figure

```{r, fig.width=25, fig.height=7}
p1 <- plot_grid( go_endo_up+ theme(plot.title = element_text(hjust = 1),
                                   axis.title.x = element_text(hjust = 1)), 
                 go_endo_down+ theme(plot.title = element_text(hjust = 1),
                                     axis.title.x = element_text(hjust = 1)), 
                 label_size = 18,
                 ncol=1,
                 rel_heights = c(2,2),
                labels = c("A", "B","C"))

p1

chol_emb <- (chol_umap + chol_gr ) + 
  plot_layout(widths = c(1,2), 
                guides = "collect"
                )

```

```{r, fig.height=10, fig.width=8}
plot_grid(
  go_endo_up+ theme(plot.title = element_text(hjust = 1),
                                   axis.title.x = element_text(hjust = 1)), 
                 go_endo_down+ theme(plot.title = element_text(hjust = 1),
                                     axis.title.x = element_text(hjust = 1)), 
                 label_size = 18,
                 ncol=1,
                 rel_heights = c(2,2), rel_widths = c(2,2),
                labels = c("A", "B")
  ) +
  ggsave("~/mount/gdrive/milo/Figures/liver_v2/suppl_fig_endo.pdf", height = 12, width=12) +
  ggsave("~/mount/gdrive/milo/Figures/liver_v2/suppl_fig_endo.png", height = 12, width=12)


```
```{r, fig.height=10, fig.width=8}
plot_grid(plot_grid(chol_umap, chol_gr, volcano_chol, nrow=1,rel_widths = c(1,2,2),
                          label_size = 18,
                labels = c("A","B","C")),
                go_chol_up + theme(plot.title = element_text(hjust = 1),
                                   axis.title.x = element_text(hjust = 1)), 
                ncol=1,
                rel_heights = c(1,1),
                 label_size = 18,
                labels=c("",'D')) +
   ggsave("~/mount/gdrive/milo/Figures/liver_v2/suppl_fig7.pdf", height = 13, width=14) +
  ggsave("~/mount/gdrive/milo/Figures/liver_v2/suppl_fig7.png", height = 13, width=14) 
```
